Abelian periodicity of strings has been studied extensively in recent years. In 2006
Constantinescu and Ilie defined the abelian period of a string, and several algorithms for
the computation of all abelian periods of a string have since been given. In contrast to the
classical period of a word, its abelian version is more flexible: factors of the word are
considered the same under any internal permutation of their letters. We show two $O(|y|^2)$
algorithms for the computation of all abelian periods of a string $y$. The first one maps each
letter to a suitable number such that each factor of the string can be identified by the unique
sum of the numbers corresponding to its letters, and hence abelian periods can be identified
easily. The other one maps each letter to a prime number such that each factor of the
string can be identified by the unique product of the numbers corresponding to its letters,
and so abelian periods can be identified easily. We also define weak abelian periods on
strings and give an $O(|y|\log|y|)$ algorithm for their computation, together with some
other algorithms for more basic problems.

Keywords: strings; algorithms; abelian periods.

Introduction

The notion of periodicity in strings is well studied in many fields, such as combinatorics
on words, pattern matching, data compression and automata theory (see [25, 26]),
because it is of paramount importance in several applications, not to mention its
theoretical aspects.

A string $u$ is a period of $y$ if $y$ is a prefix of $u^k$ for some positive integer $k$ (i.e.
$y$ is a prefix of $uy$). The period of $y$, denoted by Period($y$), is the length of the
shortest period of $y$. A lot of research has concentrated on classical periods,
e.g. algorithms for finding all periods of a string, algorithms for the computation of
the period array of a string [23], etc.

Abelian periods are more flexible than classical ones and are defined in terms
of Parikh vectors as in [15]. The Parikh vector of a string $y$, denoted by $\mathcal{P}(y)$,
enumerates the cardinality of each letter of $\Sigma$ in $y$. That is, $\mathcal{P}[i-1]$ is the
cardinality of the $i$th letter of $\Sigma$ in $y$, where $1 \le i \le |\Sigma|$. A string $y$ is said to
have an abelian period $(h,p)$ if $y = u_0 u_1 \cdots u_{k-1} u_k$ such that
$\mathcal{P}(u_0) \subset \mathcal{P}(u_1) = \cdots = \mathcal{P}(u_{k-1}) \supset \mathcal{P}(u_k)$
and $|\mathcal{P}(u_0)| = h$, $|\mathcal{P}(u_1)| = p$.

Abelian periodicity has been extensively studied in recent years
[4, 5, 6, 7, 11, 16, 17, 27]. Early efficient algorithms for abelian pattern matching were
given in [18, 19], and later some linear time algorithms were designed in
[9, 10, 14]. Recently Fici et al. gave five algorithms for the computation of all abelian
periods of a string [20]. They proposed two off-line algorithms, a brute force
algorithm and one that uses a select array, that run in $O(|y|^2|\Sigma|)$, and three on-line
algorithms, where the first two run in $O(|y|^3|\Sigma|)$ and the other one runs in
$O(|y|^3\log(|y|)|\Sigma|)$. Experimentally, the off-line algorithm that makes use of the
select array is said to be the fastest in practice.

In this article, we show two $O(|y|^2)$ algorithms for the computation of all abelian
periods of a string $y$. The first one maps each letter to a suitable number such that
each factor of the string can be identified by the unique sum of the numbers
corresponding to its letters. The other one maps each letter to a prime number such that
each factor of the string can be identified by the unique product of the numbers
corresponding to its letters. We are then able to perform the required checks of Parikh
vectors, necessary to identify abelian periods, with just one operation. Additionally
we define weak abelian periods on strings and give an $O(|y|\log|y|)$ algorithm for
their computation. Some other algorithms for basic problems on identification of
periods, which form the basis of the previous ones, are also analyzed.

The rest of the article is structured as follows. In Section 1 we present the
basic definitions used throughout the article and we define the problems solved.
In Section 2 we prove some properties of abelian periods, Parikh vectors and their
relation to the S-signature and P-signature of factors of the string, and we also quote
some properties of prime numbers which are used later for the design and analysis
of the provided algorithms. In Section 3 we describe our algorithms for solving the
stated problems. Finally, we briefly conclude and give some future proposals in
Section 4.



1. Definitions and Problems 

We define an alphabet $\Sigma$ as a finite, non-empty set of symbols. An ordering can be
defined via a bijection $\phi : \Sigma \to \{1,2,\ldots,|\Sigma|\}$. Throughout this article we consider a
string $y$, $|y| = n$, composed of letters drawn from an alphabet $\Sigma = \{\Sigma_1, \Sigma_2, \ldots, \Sigma_\sigma\}$,
where $|\Sigma| = \sigma \le n$. It is represented as $y[0..n-1]$. A string $w$ is a factor of $y$ if
$y = uwv$ for two strings $u$ and $v$. It is a prefix of $y$ if $u$ is empty and a suffix of $y$
if $v$ is empty. A string $u$ is a border of $y$ if $u$ is both a prefix and a suffix of $y$. The
border of $y$, denoted by Border($y$), is the length of the longest border of $y$. A string
$u$ is a period of $y$ if $y$ is a prefix of $u^k$ for some positive integer $k$ (i.e. $y$ is a prefix
of $uy$). The period of $y$, denoted by Period($y$), is the length of the shortest period
of $y$.

Definitions relative to Parikh vectors are as in [15, 20]. The Parikh vector of a
string $y$, denoted by $\mathcal{P}(y)$, enumerates the cardinality of each letter of $\Sigma$ in $y$. That
is, $\mathcal{P}[i-1]$ is the cardinality of the $i$th letter of $\Sigma$ in $y$, where $1 \le i \le \sigma$. We denote
by $\mathcal{P}_y(i,m)$ the Parikh vector of the factor of $y$ of length $m$ starting at position $i$.
The sum of the components of a Parikh vector is denoted by $|\mathcal{P}|$. Given two Parikh
vectors $\mathcal{P}$, $\mathcal{Q}$ we write $\mathcal{P} \subset \mathcal{Q}$ if $\mathcal{P}[i] \le \mathcal{Q}[i]$ for every $0 \le i \le \sigma-1$ and $|\mathcal{P}| < |\mathcal{Q}|$.
A string $y$ is said to have an abelian period $(h,p)$ if $y = u_0 u_1 \cdots u_{k-1} u_k$ such that:

• $\mathcal{P}(u_0) \subset \mathcal{P}(u_1) = \cdots = \mathcal{P}(u_{k-1}) \supset \mathcal{P}(u_k)$
• $|\mathcal{P}(u_0)| = h$, $|\mathcal{P}(u_1)| = p$

Factors $u_0$ and $u_k$ are called the head and the tail of the abelian period respectively.
A string $y$ is said to have a weak abelian period $p$ if $y = u_0 u_1 \cdots u_{k-1} u_k$ such that:

• $\mathcal{P}(u_0) = \mathcal{P}(u_1) = \cdots = \mathcal{P}(u_{k-1}) \supset \mathcal{P}(u_k)$

• $|\mathcal{P}(u_0)| = p$

Example 1. String $y$ = caabbacabbca has (2,5) as an abelian period (see Figure 1)
and 5 as a weak abelian period (see Figure 2).

Figure 1: (2,5) is an abelian period of $y$: ca | abbac | abbca.

Figure 2: 5 is a weak abelian period of $y$: caabb | acabb | ca.



A natural order can be defined on abelian periods as follows: let $(h,p)$ and $(h',p')$
be abelian periods of a string $y$; then $(h,p) < (h',p')$ if $p < p'$ or ($p = p'$ and $h < h'$).
Given a mapping $p : \Sigma \to A$, where $A$ is the set of the first $\sigma$ prime numbers, such
that $p(\Sigma_i)$ is the $i$th prime number, the P-signature of a word $y$ is defined to be equal
to $\prod_{i=0}^{n-1} p(y[i])$. We remind the reader that a prime is a positive integer greater
than 1 having exactly one positive divisor other than 1.

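Given a mapping $s : \Sigma \to B$, where $B$ is the set consisting of 0 and the first $\sigma - 1$
powers of $n+1$, such that:

$$ s(\Sigma_i) = \begin{cases} 0, & i = 1 \\ (n+1)^{i-2}, & \text{otherwise} \end{cases} $$

the S-signature of a word $y$ is defined to be equal to $\sum_{i=0}^{n-1} s(y[i])$.

The array $Pr$, where $Pr[i] = \prod_{j=0}^{i} p(y[j])$, is useful in computing the P-signature
of substrings of $y$, as:

$$ \text{P-signature}(y[q..k]) = \begin{cases} Pr[k]/Pr[q-1], & q \neq 0 \\ Pr[k], & q = 0 \end{cases} $$

The array $S$, where $S[i] = \sum_{j=0}^{i} s(y[j])$, is useful in computing the S-signature of
substrings of $y$, as:

$$ \text{S-signature}(y[q..k]) = \begin{cases} S[k] - S[q-1], & q \neq 0 \\ S[k], & q = 0 \end{cases} $$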
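
As a hedged illustration of the two prefix arrays, the following Python sketch builds $Pr$ and $S$ for a string and reads off the signature of a factor $y[q..k]$. The function names (`first_primes`, `prefix_arrays`, `p_signature`, `s_signature`) and the explicit `alphabet` argument that fixes the letter order are assumptions made for this example only.

```python
def first_primes(k):
    """Return the first k primes by trial division (enough for small alphabets)."""
    primes, c = [], 2
    while len(primes) < k:
        if all(c % q for q in primes):
            primes.append(c)
        c += 1
    return primes


def prefix_arrays(y, alphabet):
    """Return (Pr, S): prefix products of p(.) and prefix sums of s(.) over y."""
    n = len(y)
    rank = {c: i for i, c in enumerate(alphabet)}      # 0-based letter rank
    primes = first_primes(len(alphabet))
    Pr, S, prod, total = [], [], 1, 0
    for c in y:
        prod *= primes[rank[c]]                        # p(letter) = rank-th prime
        total += 0 if rank[c] == 0 else (n + 1) ** (rank[c] - 1)
        Pr.append(prod)
        S.append(total)
    return Pr, S


def p_signature(Pr, q, k):
    """P-signature of y[q..k] from the prefix products."""
    return Pr[k] // (Pr[q - 1] if q else 1)


def s_signature(S, q, k):
    """S-signature of y[q..k] from the prefix sums."""
    return S[k] - (S[q - 1] if q else 0)
```

With `Pr, S = prefix_arrays("acabbacabbca", "abc")`, the factors `y[2..6]` and `y[7..11]` are permutations of each other, so both signatures agree for them. Python integers are unbounded, so the growth of the products is not an issue in this sketch.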



We consider the following problems: 

Problem 1 (Abelian period decision) Decide if $(h,p)$, where
$0 \le h < \min(p, \lfloor (n-1)/2 \rfloor + 1)$ and $1 \le p \le n$, is an abelian period of some string $y$.

Problem 2 (String-Abelian period decision) Decide if a string $x$, where
$|x| = m \le n$, composed from the same alphabet $\Sigma$ as a string $y$, can be an abelian period
of $y$, i.e. there exists an abelian period $(h,p)$ of $y$ such that $y[h..h+p-1]$ is a
permutation of $x$.

Problem 3 (String-Abelian periods) Output all abelian periods $(h,p)$ of $y$ such
that $y[h..h+p-1]$ is a permutation of a string $x$, where $|x| = m \le n$ and $x$ is
composed from the same alphabet $\Sigma$ as $y$.

Problem 4 (Computing all weak abelian periods of a string) Compute all 
weak abelian periods of some string y. 

Problem 5 (Computing all abelian periods of a string) Compute all
abelian periods of some string $y$.

2. Properties

In this section, we prove some useful properties of abelian periods and we also quote
some fundamental properties of primes that prove to be useful for the analysis of
our algorithms.

Theorem 2. (Fundamental Theorem of Arithmetic) [21]

Every positive integer, except 1, can be represented in exactly one way, apart from
permutations, as a product of one or more primes.

Theorem 3. (Prime Number Theorem) [21]

$\pi(n) \sim \frac{n}{\ln n}$, where $\pi(n)$ is the number of primes not exceeding $n$.

Corollary 4. [21] $p_n \sim n \log n$, where $p_n$ is the $n$th prime number.

Theorem 5. [5] There exists an algorithm that gives the prime numbers up to a
natural number $N$ in time $O\left(\frac{N}{\log\log N}\right)$.

Theorem 6. [21] $\lim_{n\to\infty} \left(\sum_{i=1}^{n} \frac{1}{i} - \ln n\right) = \gamma$, where $\gamma$ is the Euler-Mascheroni
constant.

Lemma 7. There exists an algorithm that gives the first $n$ primes in time
$O\left(\frac{n\log n}{\log\log(n\log n)}\right)$.

Proof. Immediate consequence of Theorem 5 and Corollary 4: by Corollary 4 it
suffices to generate the primes up to $N = O(n\log n)$. □
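
The idea behind Lemma 7 can be sketched in Python as follows. Note the substitutions: this sketch uses the plain sieve of Eratosthenes rather than the faster sieve referred to in Theorem 5, and it bounds the $n$th prime with the known inequality $p_n < n(\ln n + \ln\ln n)$ for $n \ge 6$ instead of the asymptotic estimate of Corollary 4. The function name `first_n_primes` is hypothetical.

```python
import math


def first_n_primes(n):
    """Return the first n primes by sieving up to an upper bound on the n-th prime."""
    if n < 1:
        return []
    # For n >= 6 the n-th prime is below n(ln n + ln ln n); 15 covers n < 6.
    limit = 15 if n < 6 else int(n * (math.log(n) + math.log(math.log(n)))) + 1
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting at i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]][:n]
```

Only the first $\sigma$ primes are needed for the P-signature, so in practice the sieve is run once with `n` set to the alphabet size.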

Lemma 8. Two strings $x$, $y$ of the same length are represented by the same Parikh
vector iff they share the same P-signature.

Proof. Immediate consequence of Theorem 2. □

Lemma 9. Two strings $x$, $y$ of the same length are represented by the same Parikh
vector iff they share the same S-signature.

Proof. Direct: Suppose $x$ and $y$ are strings of the same length (at most $n$) and share
the same S-signature, i.e.:

S-signature($x$) $= \sum_{i=0}^{|x|-1} s(x[i]) = \sum_{i=0}^{k} a_i (n+1)^i$
S-signature($y$) $= \sum_{i=0}^{|y|-1} s(y[i]) = \sum_{i=0}^{q} b_i (n+1)^i$

where $a_i$ is the cardinality of $\Sigma_{i+2}$ in $x$ and $b_i$ is the cardinality of $\Sigma_{i+2}$ in $y$,
with $a_k \neq 0$ and $b_q \neq 0$. W.l.o.g. consider $k \ge q$.

S-signature($y$) $= \sum_{i=0}^{q} b_i (n+1)^i \le \sum_{i=0}^{q} n(n+1)^i$, as $b_i \le n$,

and so S-signature($y$) $< (n+1)^{q+1}$.

Therefore $q = k$, and by using similar arguments:

$\sum_{i=0}^{k-1} a_i (n+1)^i \le \sum_{i=0}^{k-1} n(n+1)^i < (n+1)^k$, and so $a_k = b_q$.

Similarly it follows that $a_j = b_j$ for every $j \in \{0, 1, \ldots, k\}$. The cardinality of $\Sigma_1$,
which contributes 0 to the signature, is then also the same in $x$ and $y$ because the
two strings have the same length.

Reverse: Trivial □
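
Lemmas 8 and 9 can be checked exhaustively on small inputs. The sketch below does so for all words of length at most 4 over a three-letter alphabet; the helper names and the fixed prime list are assumptions for this illustration. It also shows why the equal-length hypothesis of Lemma 9 matters: the first letter has weight 0, so words of different lengths may collide.

```python
from collections import Counter
from itertools import product

PRIMES = [2, 3, 5]  # p(a), p(b), p(c) for the illustrative alphabet "abc"


def s_signature(word, alphabet, n):
    """Sum of s(letter): 0 for the first letter, (n+1)^(i-1) for 0-based rank i >= 1."""
    total = 0
    for c in word:
        i = alphabet.index(c)
        total += 0 if i == 0 else (n + 1) ** (i - 1)
    return total


def p_signature(word, alphabet):
    """Product of the primes assigned to the letters of word."""
    prod = 1
    for c in word:
        prod *= PRIMES[alphabet.index(c)]
    return prod


def lemmas_hold(alphabet="abc", n=4):
    """Check: equal-length words share a Parikh vector iff they share each signature."""
    for length in range(1, n + 1):
        words = ["".join(t) for t in product(alphabet, repeat=length)]
        for x in words:
            for y in words:
                same = Counter(x) == Counter(y)
                if same != (s_signature(x, alphabet, n) == s_signature(y, alphabet, n)):
                    return False
                if same != (p_signature(x, alphabet) == p_signature(y, alphabet)):
                    return False
    return True
```

For instance, `s_signature("b", "abc", 4)` and `s_signature("ab", "abc", 4)` coincide although the Parikh vectors differ, which is harmless in the algorithms because only factors of equal length are ever compared.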
Lemma 10. Let $rs[i]$ = minimum $j$ such that $\mathcal{P}(y[0..i-1]) \subset \mathcal{P}(y[i..j])$, where
$i \in \{1,2,\ldots,n\}$. Then $\mathcal{P}(y[0..i-1]) \subset \mathcal{P}(y[i..q])$ for all $q \in \{rs[i], rs[i]+1, \ldots, n-1\}$
and $rs[i] \le rs[i+1]$ for all $i \in \{1,2,\ldots,n-1\}$.

Proof. First part:

Let $rs[i]$ = minimum $j$ such that $\mathcal{P}(y[0..i-1]) \subset \mathcal{P}(y[i..j])$ for some
$i \in \{1,2,\ldots,n\}$. Then for $q \in \{rs[i], rs[i]+1, \ldots, n-1\}$ it holds that
$\mathcal{P}(y[i..q]) = \mathcal{P}(y[i..rs[i]]) + \mathcal{P}(y[rs[i]+1..q])$ and hence $\mathcal{P}(y[0..i-1]) \subset \mathcal{P}(y[i..q])$.

Second part:

By definition, $rs[i+1]$ = minimum $j$ such that $\mathcal{P}(y[0..i]) \subset \mathcal{P}(y[i+1..j])$.
Compared with $rs[i]$, only the letter $y[i]$ has moved from the factor to the prefix, so
the factor has to be extended just far enough to contain sufficiently many occurrences
of $y[i]$. Hence

$rs[i+1] = rs[i] + \max(0,\; i + k - rs[i]) \ge rs[i]$,

where $k$ is the minimum value such that $(\mathcal{P}(y[i+1..i+k]) - \mathcal{P}(y[0..i]))[y[i]] \ge 0$. □

Lemma 11. Let $re[i]$ = maximum $j$ such that $\mathcal{P}(y[n-i..n-1]) \subset \mathcal{P}(y[j..n-i-1])$,
where $i \in \{1,2,\ldots,n\}$. Then $\mathcal{P}(y[n-i..n-1]) \subset \mathcal{P}(y[q..n-i-1])$ for
all $q \in \{re[i], re[i]-1, \ldots, 0\}$ and $re[i] \ge re[i+1]$ for all $i \in \{1,2,\ldots,n-1\}$.

Proof. Similar to the proof of Lemma 10. □

3. The algorithms 

In this section, we describe our algorithms for solving Problems 1-5. Firstly we
describe some data structures that are used throughout the algorithms. Then we
show how to solve the more basic problems and we extend these ideas to solve
Problem 4 and Problem 5, ending with some comments on the analysis of the given
algorithms.

3.1. Preprocessing 

Before proceeding with the algorithms we will need some preprocessing to compute
the following:

• The S-signature of each prefix of $y$ is precomputed and stored in array $S$,
such that $S[i]$ = S-signature($y[0..i]$) for $0 \le i \le n-1$. The necessary
powers of $n+1$ can be computed in $O(\sigma)$ time and stored in an array so that
they do not have to be recomputed every time they are needed. Then we fill
the array using the properties $S[0] = s(y[0])$ and $S[i] = S[i-1] + s(y[i])$
for $1 \le i \le n-1$.

• The P-signature of each prefix of $y$ is precomputed and stored in array $Pr$,
such that $Pr[i]$ = P-signature($y[0..i]$) for $0 \le i \le n-1$. We assume that
the necessary primes are readily available from a library.
Otherwise we can produce them fast using a prime sieve as in [3] (see also
Theorem 5). Then we fill the array using the properties $Pr[0] = p(y[0])$ and
$Pr[i] = Pr[i-1] \cdot p(y[i])$ for $1 \le i \le n-1$.

• The array $rs$, where $rs[i]$ = minimum $j$ such that $\mathcal{P}(y[0..i-1]) \subset \mathcal{P}(y[i..j])$,
is computed in linear time using the properties $rs[i+1] \ge rs[i]$
and $re[i+1] \le re[i]$ (Lemmas 10 and 11). We use a simple sliding window approach,
keeping $\mathcal{P}(y[h..rs[h-1]]) - \mathcal{P}(y[0..h-1])$ in the array $PV$. When trying to
find $rs[h]$, if $PV[\phi[y[h-1]] - 1] \ge 0$ then $rs[h] = rs[h-1]$; otherwise we extend
the window to the right through $y[rs[h-1]+1..n-1]$ until it contains enough
occurrences of $y[h-1]$ and use the position reached as the answer. If this is not
possible we assign $n$ to $rs[h]$ (Lemma 10).

• The array $re$, where $re[i]$ = maximum $j$ such that $\mathcal{P}(y[n-i..n-1]) \subset \mathcal{P}(y[j..n-i-1])$,
is computed in a similar manner to the way we compute
$rs$, so we only give the algorithm to find $rs$.



ALGORITHM RS(y, n, σ)
    // PV holds P(y[h..q]) - P(y[0..h-1]); initially the window is y[0..0]
    rs[0] <- 0;
    for i <- 0 to σ - 1 do
        PV[i] <- 0;
    PV[φ[y[0]] - 1] <- 1;
    for h <- 1 to n do
        // y[h-1] leaves the window and joins the prefix
        PV[φ[y[h - 1]] - 1] <- PV[φ[y[h - 1]] - 1] - 2;
        if (PV[φ[y[h - 1]] - 1] ≥ 0) or (rs[h - 1] = n) then
            rs[h] <- rs[h - 1];
        else
            q <- rs[h - 1];
            while (PV[φ[y[h - 1]] - 1] < 0) and (q < n - 1) do
                q <- q + 1;
                PV[φ[y[q]] - 1] <- PV[φ[y[q]] - 1] + 1;
            if PV[φ[y[h - 1]] - 1] < 0 then
                rs[h] <- n;
            else
                rs[h] <- q;
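
A possible Python rendering of the sliding window computation of $rs$, with $re$ obtained by running the same routine on the reversed string, is sketched below. Two assumptions are made and should be read as such: containment of Parikh vectors is taken componentwise and non-strict, which is the reading that reproduces the $rs$ and $re$ rows of the example in Section 3.4.3, and the window is initialised to $y[0..0]$. The names `compute_rs` and `compute_re` are hypothetical.

```python
def compute_rs(y):
    """rs[h] = least q with Parikh(y[0..h-1]) <= Parikh(y[h..q]), or n if none."""
    n = len(y)
    rs = [0] * (n + 1)
    diff = dict.fromkeys(y, 0)   # Parikh(window y[h..q]) - Parikh(prefix y[0..h-1])
    if n:
        diff[y[0]] = 1           # h = 0: window y[0..0], empty prefix
    for h in range(1, n + 1):
        c = y[h - 1]
        diff[c] -= 2             # c leaves the window and joins the prefix
        q = rs[h - 1]
        if q == n:               # already impossible for h - 1, hence for h
            rs[h] = n
            continue
        while diff[c] < 0 and q < n - 1:
            q += 1               # extend the window by one letter
            diff[y[q]] += 1
        rs[h] = q if diff[c] >= 0 else n
    return rs


def compute_re(y):
    """re[i] = greatest j with Parikh(y[n-i..n-1]) <= Parikh(y[j..n-i-1]), or -1."""
    n = len(y)
    # Mirror image: position q in the reversed string is position n-1-q in y.
    return [n - 1 - q for q in compute_rs(y[::-1])]
```

The right end `q` never moves left, so the total work is linear in $n$ (with dictionary lookups standing in for the $\phi$-indexed array $PV$).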







3.2. Preliminary problems 

In this section, we describe algorithms for solving Problems 1-3. These problems are
quite basic and our algorithms for Problem 4 and Problem 5 use similar ideas. The
weak abelian period version of the first two problems is solved in the same manner.

Problem 1 is solved in $O(n)$ time by checking the required conditions for $(h,p)$ to
be an abelian period, i.e. the necessary Parikh vectors, using either the S-signature
or the P-signature of factors of $y$ (by Lemmas 8 and 9). A careful sliding window
implementation would also be able to solve the problem in $O(n)$ time.

Problem 2 is solved in $O(n)$ time by the following steps:

• If $\mathcal{P}(x) \not\subset \mathcal{P}(y[0..2|x|-1])$ then the answer is immediately no.

• We calculate the array $Pr$ or $S$ for rapid comparison of Parikh vectors.

• We check each $(h,m)$, where $0 \le h < \min(m, \lfloor (n-1)/2 \rfloor + 1)$, to see whether it is
an abelian period of $y$, until we find the first one that is. Each period check
goes over $O(n/m)$ factors and we check at most $m$ different periods;
hence the algorithm is linear.

Problem 3 is solved in the same way, but in the last step we keep checking for
abelian periods after we find the first one. Again each period check goes over
$O(n/m)$ factors and we check $m$ different periods; hence the linearity of
the algorithm.
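
The checks behind Problems 1-3 can be sketched as follows. This is a direct illustration rather than a tuned linear-time implementation: equality of the full blocks is tested with S-signature prefix sums, while the head and tail containments are tested with `collections.Counter`. The function names and the choice of sorted letters as the alphabet order are assumptions of the example.

```python
from collections import Counter


def s_prefix(y):
    """pre[k] = S-signature of y[0..k-1] (so pre[0] = 0)."""
    n = len(y)
    rank = {c: i for i, c in enumerate(sorted(set(y)))}
    pre = [0]
    for c in y:
        pre.append(pre[-1] + (0 if rank[c] == 0 else (n + 1) ** (rank[c] - 1)))
    return pre


def is_abelian_period(y, h, p):
    """Problem 1: decide whether (h, p) is an abelian period of y."""
    n = len(y)
    if not (0 <= h < p and h + p <= n):
        return False
    pre = s_prefix(y)
    k = (n - h) // p                                   # number of full blocks
    block = pre[h + p] - pre[h]
    if any(pre[h + (i + 1) * p] - pre[h + i * p] != block for i in range(1, k)):
        return False
    first = Counter(y[h:h + p])
    last = Counter(y[h + (k - 1) * p:h + k * p])
    head_ok = not (Counter(y[:h]) - first)             # head contained in first block
    tail_ok = not (Counter(y[h + k * p:]) - last)      # tail contained in last block
    return head_ok and tail_ok


def periods_matching(y, x):
    """Problem 3: all abelian periods (h, m) whose block y[h..h+m-1] permutes x."""
    n, m = len(y), len(x)
    target = Counter(x)
    return [(h, m) for h in range(min(m, (n - 1) // 2 + 1))
            if h + m <= n and Counter(y[h:h + m]) == target
            and is_abelian_period(y, h, m)]
```

Problem 2 then amounts to asking whether `periods_matching(y, x)` is non-empty; for $y$ = acabbacabbca and $x$ = bbaca the sketch reports the periods with $p = 5$.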



3.3. Identifying all weak abelian periods 

This algorithm uses basic ideas from the above preliminary algorithms to solve
Problem 4. Before proceeding with the algorithm, the S-signature or the P-signature
of each prefix of $y$ is precomputed and stored in the array $S$ or the array $Pr$
respectively. We also precompute the array $re$, where $re[i]$ = maximum $j$ such that
$\mathcal{P}(y[n-i..n-1]) \subset \mathcal{P}(y[j..n-i-1])$, in linear time using the properties of Lemma
11. We only show the version of the algorithm that uses the S-signature, as the
version using the P-signature is almost the same.



ALGORITHM All-Weak-Abelian-Periods-S(y, n, S, re)
    for p <- 1 to n do
        if p ≥ n - re(n mod p) - (n mod p) then
            i <- 1;
            while (i < ⌊n/p⌋) and (S[(i + 1)p - 1] - S[ip - 1] = S[p - 1]) do
                i <- i + 1;
            if i = ⌊n/p⌋ then
                Output p;



Theorem 12. Algorithm All-Weak-Abelian-Periods-S runs in time
$O(n\log n)$.

Proof. Computation of the arrays $S$ and $re$ is done in linear time, as it is easy
to see that each letter is checked at most once during that phase of preprocessing.
During the execution of the main algorithm we only go over some factors of $y$,
each of which is checked at most once. These determine the complexity of our algorithm:

$\sum_{i=1}^{n} (\text{factors of } y \text{ of length } i \text{ that are checked}) \le \sum_{i=1}^{n} \lfloor n/i \rfloor \le n \sum_{i=1}^{n} \frac{1}{i}$

Theorem 6 states that $\sum_{i=1}^{n} \frac{1}{i}$ approaches $\ln n + \gamma$, where $\gamma$ is the Euler-Mascheroni
constant, and so we get the above result. □

Theorem 13. Algorithm All-Weak-Abelian-Periods-S has $\Theta(n)$ best case
running time.

Proof. Consider an alphabet $\Sigma$. It is easy to see that the word $y = \Sigma[1]\Sigma[2]\ldots\Sigma[\sigma]$,
where $\Sigma[i]$ is the $i$th letter of $\Sigma$, has no weak abelian periods apart from the trivial
one, $n$. On executing our algorithm, $re[i] = -1$ for every $i \ge 1$, and therefore the if part
of the main loop is entered only when $p$ divides $n$, where the while loop stops after a
single comparison. The algorithm thus essentially only counts from $p \leftarrow 1$ to $n$. No better
running time is possible, as preprocessing needs $\Theta(n)$ time. □
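
The block-comparison loop of the weak abelian period algorithm can be illustrated in Python as below. As a simplification, the tail is checked directly with `collections.Counter` rather than with the array $re$, so this sketch does not attain the $O(n\log n)$ bound in the worst case; the function name and the sorted-letter alphabet order are assumptions.

```python
from collections import Counter


def weak_abelian_periods(y):
    """Return all p such that y splits into equal-Parikh blocks of length p plus a tail."""
    n = len(y)
    rank = {c: i for i, c in enumerate(sorted(set(y)))}
    pre = [0]                                   # pre[k] = S-signature of y[0..k-1]
    for c in y:
        pre.append(pre[-1] + (0 if rank[c] == 0 else (n + 1) ** (rank[c] - 1)))
    periods = []
    for p in range(1, n + 1):
        k = n // p                              # number of full blocks
        first = pre[p]
        # About n/p signature comparisons, one per full block after the first.
        if any(pre[(i + 1) * p] - pre[i * p] != first for i in range(1, k)):
            continue
        tail = Counter(y[k * p:])
        last = Counter(y[(k - 1) * p:k * p])
        if not (tail - last):                   # tail contained in the last block
            periods.append(p)
    return periods
```

For the string of Example 1, `weak_abelian_periods("caabbacabbca")` contains 5, in agreement with Figure 2.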

3.4. Identifying all abelian periods 

We propose two algorithms for the solution of Problem 5. The first one maps each
letter to a suitable number such that each factor of the string can be identified
by the unique sum of the numbers corresponding to its letters (S-signature). The
other one maps each letter to a prime number such that each factor of the string
can be identified by the unique product of the numbers corresponding to its letters
(P-signature). We are then able to perform the required checks of Parikh vectors,
necessary to identify abelian periods, with just one operation, using ideas from
the algorithms for the preliminary problems.

3.4.1. S-Signature algorithm

This algorithm makes use of the S-signature of factors of $y$ in order to make rapid
comparison of Parikh vectors. It takes as input the string $y$, its length $n$ and the
arrays $S$, $rs$ and $re$, and outputs all the abelian periods of $y$ in the required encoding.
For each possible $h$ from 0 to $\lfloor (n-1)/2 \rfloor$ we check all possible values of $p$ from
$rs(h) - h + 1$ to $n - h$ (with $p > h$). For $(h,p)$ to be an abelian period we need:

(1) $\mathcal{P}(y[0..h-1]) \subset \mathcal{P}(y[h..h+p-1])$,
i.e. $p \ge rs(h) - h + 1$.

(2) $\mathcal{P}(y[h..h+p-1]) = \mathcal{P}(y[h+p..h+2p-1]) = \cdots = \mathcal{P}(y[h+(\lfloor (n-h)/p \rfloor - 1)p \,..\, h+\lfloor (n-h)/p \rfloor p - 1])$,

i.e. $S[(i+1)p+h-1] - S[ip+h-1] = S[p+h-1] - S[h-1]$ for all
$i \in \{1,2,\ldots,\lfloor (n-h)/p \rfloor - 1\}$, with the convention $S[-1] = 0$.

(3) $\mathcal{P}(y[h+\lfloor (n-h)/p \rfloor p \,..\, n-1]) \subset \mathcal{P}(y[h..h+p-1])$,
i.e. $p \ge n - re((n-h) \bmod p) - ((n-h) \bmod p)$.

ALGORITHM All-Abelian-Periods-S(y, n, S, rs, re)
    for h <- 0 to ⌊(n - 1)/2⌋ do
        for p <- max(h + 1, rs(h) - h + 1) to n - h do
            if p ≥ n - re((n - h) mod p) - ((n - h) mod p) then
                i <- 1;
                while (i < ⌊(n - h)/p⌋) and (S[(i + 1)p + h - 1] - S[ip + h - 1] = S[p + h - 1] - S[h - 1]) do
                    i <- i + 1;
                if i = ⌊(n - h)/p⌋ then
                    Output (h, p);

Theorem 14. Algorithm All-Abelian-Periods-S runs in $O(n^2)$ time.

Proof. Computation of the arrays $S$, $rs$ and $re$ is done in linear time, as it is easy
to see that each letter is checked at most once during that phase of preprocessing.
The necessary powers of $n+1$ are computed in $O(\sigma)$ time, where $\sigma \le n$. During the
execution of the main algorithm all the factors of $y$ (at most $\frac{n(n+1)}{2}$ of them) are
checked at most once, which gives time complexity $O(n^2)$. □
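
Putting the pieces together, the S-signature algorithm can be sketched in Python as follows. The sketch assumes non-strict componentwise containment when computing $rs$ and $re$ (the reading that matches the example of Section 3.4.3), enforces $p > h$ through the lower loop bound, and uses a prefix array `pre` with a leading 0 in place of the convention $S[-1] = 0$. All function names are hypothetical.

```python
def compute_rs(y):
    """rs[h] = least q with Parikh(y[0..h-1]) <= Parikh(y[h..q]), or n if none."""
    n = len(y)
    rs = [0] * (n + 1)
    diff = dict.fromkeys(y, 0)          # Parikh(window) - Parikh(prefix)
    if n:
        diff[y[0]] = 1
    for h in range(1, n + 1):
        c = y[h - 1]
        diff[c] -= 2                    # c moves from the window to the prefix
        q = rs[h - 1]
        if q == n:
            rs[h] = n
            continue
        while diff[c] < 0 and q < n - 1:
            q += 1
            diff[y[q]] += 1
        rs[h] = q if diff[c] >= 0 else n
    return rs


def all_abelian_periods(y):
    """Return all abelian periods (h, p) of y, ordered by h and then p."""
    n = len(y)
    rank = {c: i for i, c in enumerate(sorted(set(y)))}
    pre = [0]                           # pre[k] = S-signature of y[0..k-1]
    for c in y:
        pre.append(pre[-1] + (0 if rank[c] == 0 else (n + 1) ** (rank[c] - 1)))
    rs = compute_rs(y)
    re = [n - 1 - q for q in compute_rs(y[::-1])]     # mirror image of rs
    periods = []
    for h in range((n - 1) // 2 + 1):
        for p in range(max(h + 1, rs[h] - h + 1), n - h + 1):   # condition (1)
            k, t = divmod(n - h, p)     # k full blocks, tail of length t
            if re[t] < n - t - p:       # condition (3): tail not covered
                continue
            block = pre[h + p] - pre[h]
            if all(pre[h + (i + 1) * p] - pre[h + i * p] == block
                   for i in range(1, k)):                       # condition (2)
                periods.append((h, p))
    return periods
```

On $y$ = acabbacabbca this returns the thirty pairs listed in the example of Section 3.4.3.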



3.4.2. P-signature algorithm 

This algorithm makes use of the P-signature of factors of $y$ in order to make rapid
comparison of Parikh vectors. It takes as input the string $y$, its length $n$ and the
arrays $Pr$, $rs$ and $re$, and outputs all the abelian periods of $y$ in the required
encoding. For each possible $h$ from 0 to $\lfloor (n-1)/2 \rfloor$ we check all possible values of $p$ from
$h + 1$ to $n - h$. For $(h,p)$ to be an abelian period we need:

(1) $\mathcal{P}(y[0..h-1]) \subset \mathcal{P}(y[h..h+p-1])$,
i.e. $p \ge rs(h) - h + 1$.

(2) $\mathcal{P}(y[h..h+p-1]) = \mathcal{P}(y[h+p..h+2p-1]) = \cdots = \mathcal{P}(y[h+(\lfloor (n-h)/p \rfloor - 1)p \,..\, h+\lfloor (n-h)/p \rfloor p - 1])$,

i.e. $Pr[(i+1)p+h-1]/Pr[ip+h-1] = Pr[p+h-1]/Pr[h-1]$ for all
$i \in \{1,2,\ldots,\lfloor (n-h)/p \rfloor - 1\}$, with the convention $Pr[-1] = 1$.

(3) $\mathcal{P}(y[h+\lfloor (n-h)/p \rfloor p \,..\, n-1]) \subset \mathcal{P}(y[h..h+p-1])$,
i.e. $p \ge n - re((n-h) \bmod p) - ((n-h) \bmod p)$.

ALGORITHM All-Abelian-Periods-P(y, n, Pr, rs, re)
    for h <- 0 to ⌊(n - 1)/2⌋ do
        for p <- max(h + 1, rs(h) - h + 1) to n - h do
            if p ≥ n - re((n - h) mod p) - ((n - h) mod p) then
                i <- 1;
                while (i < ⌊(n - h)/p⌋) and (Pr[(i + 1)p + h - 1]/Pr[ip + h - 1] = Pr[p + h - 1]/Pr[h - 1]) do
                    i <- i + 1;
                if i = ⌊(n - h)/p⌋ then
                    Output (h, p);

Theorem 15. Algorithm All-Abelian-Periods-P runs in $O(n^2)$ time.

Proof. Computation of the arrays $Pr$, $rs$ and $re$ is done in linear time, as it is easy
to see that each letter is checked at most once during that phase of preprocessing.
During the execution of the main algorithm all the factors of $y$ (at most $\frac{n(n+1)}{2}$ of
them) are checked at most once, which gives time complexity $O(n^2)$. □



3.4.3. An example

We provide an example showing the data structures built for the execution of
our algorithms on the string $y$ = acabbacabbca.



 i   y[i]   S[i]    Pr[i]   rs[i]   re[i]
 0    a       0         2      0      11
 1    c      13        10      2       7
 2    a      13        20      6       6
 3    b      14        60      7       6
 4    b      15       180      7       3
 5    a      15       360      9       2
 6    c      28      1800     12      -1
 7    a      28      3600     12      -1
 8    b      29     10800     12      -1
 9    b      30     32400     12      -1
10    c      43    162000     12      -1
11    a      43    324000     12      -1

In order to calculate the P-signature of factors of $y$ we use the mapping
$p : \{a, b, c\} \to \{2, 3, 5\}$, such that:

p(a) = 2    p(b) = 3    p(c) = 5

In order to calculate the S-signature of factors of $y$ we use the mapping
$s : \{a, b, c\} \to \{0, 1, 13\}$, such that:

s(a) = 0    s(b) = 1    s(c) = 13

All abelian periods of y are: 

(0,5), (0,7), (0,8), (0,9), (0,10), (0,11), (0,12), (1,6), (1,7), (1,8), (1,9), (1,10), 

(1,11), (2,5), (2,6), (2,7), (2,8), (2,9), (2,10), (3,5), (3,6), (3,7), (3,8), (3,9), 
(4,5), (4,6), (4,7), (4,8), (5,6), (5,7) 



3.5. Further comments on the complexity of the above algorithms 

In this subsection we give more details on the complexity of the suggested
algorithms. We claim that they are optimal under the natural encoding suggested
by the definition of the abelian period and that they have a best case linear
running time. We also observe that a large alphabet size may lead to the creation of
large numbers during the execution of our algorithms. However, in applications
$\sigma$ is typically very small compared to $n$, and so our algorithms are efficient.

Theorem 16. Algorithm All-Abelian-Periods-P and Algorithm
All-Abelian-Periods-S are optimal.

Proof. Consider the word $a^n$. As suggested in [20] it has $\Theta(n^2)$ abelian periods,
which is also the worst case running time of our algorithms. □

Theorem 17. Algorithm All-Abelian-Periods-P and Algorithm
All-Abelian-Periods-S have $\Omega(n)$ best case running time.

Proof. Consider an alphabet $\Sigma$. It is easy to see that the word $y = \Sigma[1]\Sigma[2]\ldots\Sigma[\sigma]$,
where $\Sigma[i]$ is the $i$th letter of $\Sigma$, has no abelian periods apart from the trivial one,
$(0,n)$. On executing our algorithms, $rs[h] = n$ for every $h \ge 1$, and therefore for these
values of $h$ we never enter the second loop of the algorithm; for $h = 0$ every candidate
$p$ is rejected after a constant number of checks. The algorithm thus essentially only
counts from $h \leftarrow 0$ to $\lfloor (n-1)/2 \rfloor$. No better running time is possible, as preprocessing
needs $\Theta(n)$ time. □

As mentioned before, a large alphabet size may lead to the creation of large
numbers during the execution of our algorithms. In particular, it is the signatures of
the factors that might grow too large. The following theorems show the worst case
size that they can have.

Theorem 18. The number of digits of variables used during the execution of
Algorithm All-Abelian-Periods-P is $O(n\log(\sigma\log\sigma))$.

Proof. Consider an alphabet $\Sigma$.

The biggest variable encountered during the execution of the algorithm is the
P-signature of the word $y = (\Sigma[\sigma])^n$, where $\Sigma[i]$ is the $i$th letter of $\Sigma$.
That means P-signature($y$) = ($\sigma$th prime number)$^n$.

As suggested by Corollary 4, P-signature($y$) is $O((\sigma\log\sigma)^n)$, whose number of
digits is $O(n\log(\sigma\log\sigma))$. □

Theorem 19. The number of digits of variables used during the execution of
Algorithm All-Abelian-Periods-S is $O(\sigma\log n)$.

Proof. Consider an alphabet $\Sigma$.

The biggest variable encountered during the execution of the algorithm is the
S-signature of the word $y = (\Sigma[\sigma])^n$, where $\Sigma[i]$ is the $i$th letter of $\Sigma$.

That means S-signature($y$) $= n(n+1)^{\sigma-2}$, whose number of digits is $O(\sigma\log n)$. □

Fortunately, the numbers formed when we execute Algorithm All-Abelian-Periods-P
can be further reduced by taking logarithms of the signatures, as shown
in the definitions below:

• The P'-signature of a word $y$ is defined to be equal to $\log\left(\prod_{i=0}^{n-1} p(y[i])\right)$.

• The array $Pr'$ is given by $Pr'[i] = \log\left(\prod_{j=0}^{i} p(y[j])\right)$.

The array $Pr'$ is useful in computing the P'-signature of substrings of $y$, as:

$$ \text{P'-signature}(y[q..k]) = \begin{cases} Pr'[k] - Pr'[q-1], & q \neq 0 \\ Pr'[k], & q = 0 \end{cases} $$

As before, the array $Pr'$ can be easily calculated using the properties $Pr'[0] = \log(p(y[0]))$
and $Pr'[i] = Pr'[i-1] + \log(p(y[i]))$ for $1 \le i \le n-1$. By Theorem 18,
the number of digits of variables used during the execution of Algorithm
All-Abelian-Periods-P is then $O(\log(n\log(\sigma\log\sigma))) = O(\log n)$, while the running time of
the algorithm is the same.

4. Conclusion-Further Work 

Parikh vectors have found applications in bioinformatics, particularly in mass
spectrometry data, DNA alignment, SNP discovery, repeated pattern discovery and gene
clusters [9]. Recently, Constantinescu and Ilie [15] defined the abelian period of a
string, and several algorithms for the computation of all abelian periods of a string
were given by Fici et al. [20].

In this article, we have provided two $O(n^2)$ time algorithms for computing all
abelian periods of a given string. We have also introduced the notion of the weak
abelian period and we gave an $O(n\log n)$ algorithm for the computation of all weak
abelian periods of a given string. Additionally, we have analyzed simpler problems
for the identification of abelian periods in strings and gave linear time algorithms for
their solution. Our algorithms make extensive use of the P-signature and S-signature
of factors of the string, thus being able to quickly compare Parikh vectors.

Further work can be done on designing a faster algorithm for the computation
of all weak abelian periods of a string or, as suggested by Fici et al. [20], on designing
an algorithm for the computation of the Abelian Period array of a given string,
i.e. computing the shortest abelian period of each prefix of a string, as for its
classical analog in [23]. Additionally, further work can be done on algorithms around
abelian quasiperiodicities, thus extending work done on classical ones, e.g. covers
and seeds [1, 2, 8, 12, 13, 24, 22]. Variants of those algorithms are very likely to find
applications in other areas such as bioinformatics.